28 research outputs found
Discriminative Features via Generalized Eigenvectors
Representing examples in a way that is compatible with the underlying
classifier can greatly enhance the performance of a learning system. In this
paper we investigate scalable techniques for inducing discriminative features
by taking advantage of simple second order structure in the data. We focus on
multiclass classification and show that features extracted from the generalized
eigenvectors of the class conditional second moments lead to classifiers with
excellent empirical performance. Moreover, these features have attractive
theoretical properties, such as inducing representations that are invariant to
linear transformations of the input. We evaluate classifiers built from these
features on three different tasks, obtaining state of the art results
Online Importance Weight Aware Updates
An importance weight quantifies the relative importance of one example over
another, coming up in applications of boosting, asymmetric classification
costs, reductions, and active learning. The standard approach for dealing with
importance weights in gradient descent is via multiplication of the gradient.
We first demonstrate the problems of this approach when importance weights are
large, and argue in favor of more sophisticated ways for dealing with them. We
then develop an approach which enjoys an invariance property: that updating
twice with importance weight is equivalent to updating once with importance
weight . For many important losses this has a closed form update which
satisfies standard regret guarantees when all examples have . We also
briefly discuss two other reasonable approaches for handling large importance
weights. Empirically, these approaches yield substantially superior prediction
with similar computational performance while reducing the sensitivity of the
algorithm to the exact setting of the learning rate. We apply these to online
active learning yielding an extraordinarily fast active learning algorithm that
works even in the presence of adversarial noise